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ABSTRACT 


There  are  several  graphic  client/server  application  development  tools 
which  can  be  used  to  easily  develop  powerful  relational  database 
applications.  However,  they  do  not  provide  a  direct  means  of  performing 
queries  which  require  relational  joins  across  multiple  database  boundaries. 

This  thesis  studies  ways  to  access  multiple  databases.  Specifically, 
it  examines  how  a  “cross-database  join”  can  be  performed.  A  case  study  of 
techniques  used  to  perform  joins  between  academic  department  financial 
management  system  and  course  management  system  databases  was  done 
using  PowerBuilder  5.0. 

Although  we  were  able  to  perform  joins  across  database  boundaries, 
we  found  that  PowerBuilder  is  not  conducive  to  cross-database  join  access 
because  no  relational  database  engine  is  available  to  execute  cross-database 
queries. 
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I.  INTRODUCTION 


A.  BACKGROUND 

Organizations  use  database  systems  and  applications  to  process 
information  needed  to  conduct  daily  business  and  operations.  Historically, 
these  database  systems  were  individually  developed  to  meet  specific  needs. 
As  an  organization’s  information  requirements  changed  over  time, 
additional  applications  were  written  to  maintain,  process,  and  extract 
information  from  the  individual  databases.  Typically  however,  each  of 
these  databases  provided  only  a  part  of  the  total  picture  needed  by  an 
organization.  An  example  of  this  is  illustrated  in  Figure  1.  In  this 
example  integration  of  the  two  databases  makes  it  possible  to  perform  a 
query  to  retrieve  the  answer  to  the  question,  “How  many  sophomores  have 
books  checked  out?”.  However,  this  information  is  not  available  from 
either  of  the  preexisting  underlying  databases  in  this  example. 
Additionally,  as  a  result  of  the  way  that  databases  in  organizations  tend  to 
evolve,  it  often  becomes  necessary  to  maintain  databases  that  contain  data 
which  is  duplicated  in  other  databases.  This  leads  to  the  additional  problem 
of  how  to  deal  with  data  inconsistencies  between  the  differing  systems. 
This  situation  is  inefficient  both  from  a  management  perspective  and  from 
a  data  processing  perspective  since  it  requires  the  additional  overhead 
associated  with  maintaining  the  same  information  in  more  than  one 
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database.  It,  also,  makes  it  difficult,  if  not  impossible,  to  maintain 
consistency  and  concurrency  of  the  duplicate  data.  Manpower  intensive 
solutions  for  eliminating  inconsistencies  and  managing  concurrency  are 
costly  and  inefficient  when  compared  to  the  potential  of  automated 
solutions  to  the  problem.  If  an  organization  could  identify  effective  and 
efficient  automated  methods  for  integrating  existing  databases  it  could 
significantly  enhance  the  usefulness  of  the  information  it  already  maintains 
by  eliminating  database  organizational  inefficiencies  which  cause  data 
duplication  and  inconsistencies. 


Figure  1 .  Integrated  view  of  tables  from  disparate  databases 


B.  THESIS  OBJECTIVES 


The  purpose  of  this  thesis  is  to  determine  if  it  is  possible  to 
concurrently  access  multiple  independent  databases  within  an  organization 
in  a  way  that  provides  an  integrated  view  of  the  information  they  contain. 
Furthermore,  assuming  that  this  is  possible,  can  multiple  databases  be 
efficiently  and  effectively  integrated  into  a  single  cohesive  database 
management  system  to  provide  a  focal  point  of  access  for  database 
applications?  What  are  the  limitations  in  the  context  of  the  relational 
model  and  in  the  context  of  the  current  database  application  development 
technology?  In  the  process  of  addressing  these  questions  this  thesis  also 
provides  some  analysis  of  the  different  methods  which  might  be  used  to 
integrate  multiple  databases  and  lists  the  relative  advantages  and 
disadvantages  of  these  methods. 

C.  METHODOLOGY 

In  order  to  test  the  actual  ability  to  integrate  multiple  databases  a 
relatively  simple  subset  of  the  overall  larger  problem  was  selected.  The 
problem  model  was  built  around  a  university  academic  department 
organization  with  only  two  specific  database  applications,  each  supported 
by  a  dedicated  database.  PowerBuilder  5.0,  which  is  a  relational  database 
management  system  front-end  development  tool,  was  used  to  develop  a 
course  scheduling  and  management  application  and  supporting  database. 
This  application  was  designed  to  provide  a  university  academic  department 
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with  a  tool  to  manage  and  plan  course  scheduling  and  course  assignments 
to  teachers.  A  method  was  then  developed  to  integrate  the  course 
scheduling  and  management  database  with  a  pre-existing  academic 
department  financial  management  system  database  which  was  also 
developed  using  PowerBuilder  5.0.  This  method  was  then  evaluated  and 
conclusions  were  developed. 

D.  ORGANIZATION  OF  THE  THESIS 

This  thesis  is  organized  as  follows:  Chapter  II  presents  a  brief 
discussion  of  the  relational  database  model,  its  impact  On  database  design, 
its  strengths  for  managing  and  accessing  data,  as  well  as  its  limitations. 
The  primary  reason  for  this  discussion  is  to  set  the  frame  work  for  any 
limitations  which  may  occur  in  implementing  a  multiple  database  solution. 
Chapter  III  provides  an  overview  and  some  discussion  of  the  various 
theoretical  approaches  which  might  be  taken  to  develop  an  integration 
solution  as  well  as  potential  advantages  and  disadvantages.  Chapter  IV 
presents  a  case  study  of  one  of  the  approaches  discussed  in  Chapter  III  and 
evaluates  this  solution  against  the  desired  goals  of  database  integration. 
Chapter  V  looks  at  the  various  database  integration  performance  issues  in 
the  context  of  the  relational  model  and  considers  the  impact  of  these  issues 
in  terms  of  the  model’s  strengths  as  well  as  its  limitations.  Finally, 
Chapter  VI  will  present  conclusions  drawn  from  this  research. 
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II.  RELATIONAL  DATABASE  MODEL 


A.  RELATIONAL  MODEL  OVERVIEW 

The  relational  model  has  been  the  subject  of  a  great  deal  of  research 
since  its  introduction  by  Codd  in  the  early  70’s.  In  the  80’ s  it  began  to 
replace  many  preexisting  networking  and  hierarchical  database  systems  and 
is  currently  the  basis  of  most  commercially  available  database  management 
systems.  Although,  this  may  be  changing  as  more  and  more  organizations 
are  looking  for  ways  to  improve  their  information  management  and  access 
capabilities  through  the  use  of  object  oriented  solutions  such  as  the  ODMG 
database  model.  [Ref.  1]  In  the  relational  model,  data  is  stored  as  relations 
which  are  frequently  referred  to  as  tables.  However,  it  is  important  to 
remember  that  these  two  terms  are  not  completely  interchangeable  since  a 
relation  is  best  represented  as  a  mathematical  set  of  tuples  which  has  no 
ordering  on  its  members.  A  table  however  usually  has  some  sort  of  order 
imposed  on  its  members  since  it  represents  an  actual  file  or  some  other 
physical  implementation  of  an  instance  of  a  relation.  A  relation  or  table 
consists  of  rows  or  tuples.  Because  by  definition,  a  relation  is  a  set  of 
tuples,  each  row  is  distinct  and  corresponds  to  an  instance  of  an  entity  or  a 
relationship  which  exists  between  two  or  more  entity  instances.  The 
members  in  a  row  or  tuple  are  the  set  of  attributes  of  the  entity  or 
relationship  instance.  The  attributes  which  correspond  to  the  columns  of  a 
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table  are  the  properties  of  the  entity  or  relationship  instance.  Using  this 
basic  structure,  database  schemas  can  be  defined  by  a  set  of  relation 
schemas  or  definitions  and  a  set  of  integrity  constraints  placed  upon  those 
schemas.  There  are  three  fundamental  update  operations  for  a  relation  and 
they  are  limited  by  the  integrity  constraints  on  the  relation  schemas.  They 
are  MODIFY,  UPDATE,  and,  DELETE.  Although  Codd  originally  defined 
eight  relational  algebra  operations  in  the  relational  model  for  manipulation 
of  relation  instances,  essentially  only  five  of  these  operations  are 
primitive.  These  operations  are  SELECT,  PROJECT,  UNION, 
DIFFERENCE,  and  CARTESIAN  PRODUCT.  [Refs.  2,  3]  A  more  detailed 
discussion  of  relation  schemas,  integrity  constraints,  update  operations, 
and  relational  algebra  operations  can  be  found  in  [Ref.  4].  It  is  this 
relatively  simple  framework  which  has  formed  the  basis  of  a  large  number 
of  relational  database  management  systems  in  recent  years. 

B.  STRENGTHS 

•  The  relational  model  is  based  on  a  simple  and  uniform  data 
structure  and  is  therefore  easier  to  understand  than  other 
database  models  such  as  the  hierarchical  and  network  data 
models. 

•  It  is  strongly  founded  on  the  mathematical  concepts  of 
predicate  logic  and  set  theory  allowing  the  representation  and 


manipulation  of  data  to  be  formalized.  [Ref.  3]  This  is  the 
basis  of  SQL,  which  is  one  of  the  most  widely  used  query 
languages  in  database  management  systems. 

•  Allows  a  more  abstract  representation  of  a  database  than 
previous  models  such  as  the  hierarchical  and  network  data 
models.  [Ref.  3]  Therefore  it  is  easier  to  design  a  database 
model  which  more  closely  resembles  real  world  instances  and 
relationships. 

C.  LIMITATIONS 

•  Insufficient  semantic  completeness  or  expressiveness.  [Ref.  5] 
This  has  been  a  very  popular  topic  for  discussion  and  many 
have  proposed  semantic  modeling  extensions  to  the  relational 
model  to  improve  it.  [Refs.  5,  6,  7]  A  good  example  of  a 
semantic  modeling  extension  is  the  Entity-Relational  (ER) 
model.  It  is  one  of  the  most  well  known  semantic  data  models 
because  it  is  simple  and  easy  to  understand.  However,  when 
an  ER  model  is  converted  into  a  database  schema  it  may  often 
lose  its  resemblance  to  the  real  world  entities  and  relationships 
it  was  intended  to  represent.  [Refs.  3,  4] 

•  Mathematical  aggregate  functions  cannot  be  expressed  in 
relational  algebra.  In  order  to  specify  operations  such  as 
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SUM,  or  AVERAGE,  additional  operators  must  be  defined  by 
the  DBMS.  [Ref.  4] 

•  The  recursive  closure  operation  cannot  be  specified  in 
relational  algebra.  In  order  to  specify  a  query  to  retrieve  all 
instances  in  a  recursive  relationship,  some  form  of  looping 
mechanism  is  needed  as  well  as  a  means  to  specify  the  number 
of  recursive  levels  required  for  the  specific  query  based  on 
some  base  condition.  [Ref.  4] 

•  OUTER  JOIN  and  OUTER  UNION  operations  are  required  to 
extend  the  relational  JOIN  and  UNION  operations  to  handle 
joins  between  relations  or  tables  which  either  have  some  tuples 
or  records  which  contain  null  values  in  the  join  attributes  or 
have  tuples  or  records  which  are  not  union  compatible. 
[Ref.  4] 

•  The  relational  JOIN  can  be  a  very  time  consuming  operation 
and  is  frequently  the  cause  of  performance  degradation.  This 
is  because  it  requires  the  cross-referencing  of  a  tuple  at  a  time 
between  two  relations  and  there  are  many  possible  ways  to 
execute  this  operation.  As  a  result,  complex  queries  are 
usually  a  performance  bottleneck  for  large  database  systems. 
[Refs.  4,  7] 
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III.  INTEGRATION  METHODS 


A.  MULTIPLE  DATABASE  INTEGRATION 

What  is  meant  by  the  term  multiple  database  integration?  The  term 
integration  itself  means  to  make  into  a  whole  by  bringing  all  parts  together 
or  to  make  one  thing  a  part  of  something  else.  This  definition  is  applied  in 
the  context  of  database  design  in  this  thesis  to  mean  the  logical  or 
physical  combination  of  multiple  databases  into  one  database.  Integration 
of  multiple  databases  can  be  particularly  useful  when  the  same  information 
is  represented  by  data  in  multiple  databases  or  where  real  world 
relationships  exist  between  entities  represented  by  data  in  disparate 
databases.  In  fact,  within  the  context  of  a  relational  model,  some 
duplication  of  data  in  disparate  databases  is  actually  necessary  in  order  to 
create  logical  relationships  which  model  the  real  world  relationships  not 
yet  realized  within  the  organization’s  information  infrastructure.  Given  the 
above  meaning  for  multiple  database  integration,  data  in  disparate 
databases  can  then  be  accessed  from  a  single  vantage  point.  It  is  then 
possible  to  present  new  and  more  expanded  information  views  derived  from 
data  spread  over  multiple  databases  and  to  automate  more  efficient 
procedures  for  eliminating  inconsistencies  between  them.  Following  are  a 
taxonomy  of  strategies  which  might  provide  database  integration  solutions 
as  well  as  a  brief  discussion  of  probable  advantages  and  disadvantages. 
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B.  PHYSICAL  INTEGRATION 


The  physical  integration  of  a  database  is  conceptually  a  straight 
forward  process  and  consists  of  designing  a  single  database  which  contains 
the  union  of  all  information  contained  in  the  preexisting  databases  which 
are  to  be  integrated  (Figure  2).  The  actual  process  of  designing  this  new 


Figure  2.  Physical  Database  Integration 

database  can  be  very  difficult  because  it  requires  a  complete  system 
redesign.  It  will  almost  always  require  changes  to  any  preexisting 
applications  and,  in  some  cases,  these  changes  will  be  substantial.  The 
most  obvious  benefits  to  this  approach  over  other  integration  solutions 
involving  a  more  loosely  coupled  arrangement  are  better  performance  and 
easier  and  less  expensive  application  development  for  data  access  and 
retrieval  after  a  physical  integration  solution  is  achieved.  In  a  large 
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organization  with  very  large  information  management,  storage,  and  access 
requirements,  the  physical  integration  will  result  in  what  is  now  most 
frequently  referred  to  as  a  Very  Large  Database  (VLDB).  [Ref.  8]  In  this 
case  redesign  will  require  special  attention  to  ensure  that  a  number  of  new 
challenges  are  handled.  One  very  large  possible  downside  to  this  strategy 
is  that  it  may  not  be  as  extensible  as  other  more  modular  strategies  which 
maintain  the  existence  of  separate  databases.  Special  care  is  also  required 
to  achieve  a  solution  which  is  scaleable  in  order  to  accommodate  future 
expansion  to  handle  new  information  requirements  and  avoid  the  need  for 
costly  changes  and/or  complete  system  redesign.  Therefore,  in  terms  of 
system  maintenance  and  updates  as  a  result  of  changes  in  the  organization’s 
business  and  management  practices,  this  solution  could  easily  become  cost 
prohibitive.  In  other  words,  as  soon  as  the  next  new  requirement  for 
information  handling  and  access  is  identified,  a  strategy  of  physical 
integration  could  result  in  the  organization  being  back  in  the  same  situation 
it  was  in  when  it  started  out  on  the  path  to  integration. 

C.  LOGICAL  INTEGRATION 

1.  Database  Level 

The  strategy  behind  logical  integration  at  the  database  level  is  the 
use  of  a  middleware  layer  which  connects  to  each  database  separately,  yet 
presents  a  single  interface  to  the  application  layer  so  that  multiple 
databases  appear  to  applications  as  one  large  database  (Figure  3).  A 


11 


significant  advantage  to  this  approach  is  that  it  effectively  maintains  the 
same  level  of  simplicity  in  query  formulation  as  the  physical  integration 
strategy  while  at  the  same  time  allowing  the  use  of  existing  databases  with 
fewer  or  no  changes  to  existing  structure.  This  strategy  has  little  to  no 
effect  on  existing  applications  and  only  requires  changes  where  necessary 
to  accommodate  changes  in  their  underlying  databases  due  to  field  type 
inconsistencies  between  duplicate  data.  The  connections  between 
preexisting  databases  and  their  applications  are  not  affected.  This  strategy 
could  be  implemented  through  a  kind  of  universal  database  engine  which 
establishes  a  single  connection  to  each  database.  Its  job  is  to  receive 
cross-database  queries  from  an  application  and  parse  it  into  the  appropriate 
subquery  components  for  each  database  involved.  For  example,  using  the 
example  of  the  university  library  given  in  Chapter  1,  the  universal  engine 
would  take  the  query  based  on  the  question  “How  many  sophomores  have 
books  checked  out?”  and  divide  it  into  two  subqueries.  The  subquery  for 
the  library  database  would  be  based  on  the  smaller  question  “Does  student 
A,  who  is  a  sophomore,  have  any  books  checked  out?”  and  the  subquery  for 
the  student  admin  database  would  be  “Retrieve  a  list  of  all  students  who 
are  sophomores.”  The  universal  engine  then  performs  the  join  of  the  two 
intermediate  result  sets  and  applies  an  aggregate  function  to  return  the 
final  answer  to  the  original  question.  At  the  same  time  the  engine  should 
perform  query  optimization  based  on  some  acquired  knowledge 
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Integrated  Application 


Figure  3,  Logical  Integration  at  the  Database  Level 


of  the  underlying  databases  structures.  For  example,  it  is  probably  faster 
to  get  the  result  above  by  checking  the  library  database  for  the  existence  of 
a  particular  “student_id”  after  first  retrieving  a  list  of  sophomores  if  the 
university  has  a  very  large  number  of  students  in  comparison  to  the  total 
number  of  books  checked  out.  This  logical  integration  strategy  benefits 
largely  from  its  more  modular  approach  and  its  ability  to  leverage  existing 
information  infrastructure.  Its  modularity  allows  the  use  of  databases 
which  may  be  distributed  and  performance  can  benefit  from  a  greater 
degree  of  parallelism  since  different  parts  of  a  cross-database  query  can  be 
processed  simultaneously.  Additionally,  this  more  modular  and  loosely 
coupled  approach  allows  easier  modification  without  necessarily  affecting 
other  parts  of  the  organization’s  information  infrastructure.  This  can  allow 
an  organization  to  upgrade  its  systems  gradually,  and  has  more  flexibility 
to  accommodate  changes  as  the  organization’s  information  needs  change 
over  time.  The  key  to  the  success  in  this  strategy  is  the  universal  engine 
itself.  It  sounds  nice  theoretically,  but  it  requires  features  that  are  difficult 
to  achieve.  Fortunately  the  first  of  these  is  already  here.  The  open 
database  connectivity  (ODBC)  standard  established  by  Microsoft  is  the 
kind  of  common  interface  necessary  for  an  application  to  connect  to 
multiple  relational  database  management  systems.  The  more  difficult  part 
of  the  universal  engine  concept  is  the  capability  to  establish  a  knowledge 
base  from  the  connected  databases  making  it  possible  to  perform  intelligent 
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and  effective  query  optimization.  Because  this  solution  carries  with  it  the 
additional  overhead  of  the  middleware  layer  it  might  not  be  expected  to 
perform  as  well  as  a  more  tightly  coupled  approach,  yet  its  distributed 
architecture  may  actually  yield  better  overall  performance  particularly  in 
situations  where  there  is  heavy  user  load  or  when  handling  large  queries 
which  span  multiple  databases.  Resolving  data  type  mismatches  where 
cross-database  queries  are  involved  is  more  of  a  challenge  with  this 
integration  strategy  as  well  as  the  question  of  how  to  deal  with  duplicate 
data  which  leads  to  concurrency  and  consistency  problems.  Maintenance  of 
concurrency  and  consistency  between  duplicate  and  related  data  could  be 
handled  by  the  implementation  of  business  rules  in  the  middleware  layer 
for  data  which  is  related  across  database  boundaries.  How  this  would  be 
implemented  and  how  well  it  would  work  is  not  clear. 

2.  Application  Level 

Integration  at  the  application  level  could  be  accomplished  by  defining 

connections  within  an  application  to  preexisting  databases  as  illustrated  in 

Figure  4.  This  approach  is  similar  to  the  previous  logical  integration 

solution  in  that  it  is  a  more  modular  approach,  but  results  in  a  more  tightly 

coupled  solution  since  the  boundaries  become  a  little  more  blurred. 

Integration  of  the  information  contained  in  the  disparate  organizational 

databases  is  handled  completely  at  the  application  level.  Therefore,  all 

cross-database  query  optimization  must  be  handled  by  the  application 
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Integrated  Application 


Figure  4.  Application  Level  Integration 

developer.  This  makes  the  application  development  process  much  more 
complex.  The  effectiveness  of  this  approach  will  also  depend  to  a  great 


extent  on  the  robustness  of  the  application  development  environment. 
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Because  this  is  a  loosely  coupled  approach,  it  does  not  require  significant 
redesign  of  the  existing  system,  however  it  will  require  a  more  deliberate 
application  design  effort  to  resolve  data  conflicts,  minimize  duplicate  data, 
and  to  maintain  data  concurrency  and  consistency  between  the  different 
databases. 

D.  INTEGRATION  STRATEGY  CHALLENGES 

1.  Data  Type  Conflicts 

A  challenge  that  is  common  to  all  of  the  above  integration  strategies 
is  how  to  resolve  differences  in  data  types  between  cross-database  join 
fields.  The  differences  can  range  from  differences  in  field  length  to 
different  types  to  different  field  masks.  For  example,  dates  in  one  database 
might  be  defined  in  “dd/mm/yy”  format  and  in  “dd/mmm/yyyy”  format  in 
another.  Using  a  physical  integration  strategy,  these  differences  are 
resolved  in  the  design  of  the  integrated  database  and  resulting  changes  in 
data  types  and  field  lengths  will  most  likely  require  changes  to  existing 
applications.  However  special  care  must  be  taken  to  avoid  corruption  of 
data  during  the  process  of  migrating  data  from  preexisting  databases. 
Resolution  of  data  conflicts  in  the  logical  integration  strategies  will  most 
likely  require  changes  to  both  existing  databases  and  applications. 
However,  data  conflict  resolution  should  only  be  necessary  where  actual 
joins  across  database  boundaries  are  required.  These  cases  can  then  be 
resolved  as  they  are  encountered  during  the  integration  design  process. 
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2.  Duplicate  Data 

Another  common  problem  related  to  that  of  data  conflicts  is  the 
existence  of  duplicate  data.  As  we  stated  earlier,  one  of  the  goals  of 
database  integration  is  to  minimize  duplication  between  databases.  This  is 
resolved  automatically  through  proper  normalization  in  the  physical 
integration  strategy,  but  requires  special  attention  with  both  logical 
integration  approaches.  In  both  application  level  and  database  level  logical 
integration  approaches,  minimizing  duplicate  data  will  require  changes  to 
existing  databases  and  applications.  Integration  at  the  application  level 
will  require  a  more  substantial  redesign  effort  but  can  be  done  as  part  of 
the  actual  integration  implementation  while  at  the  database  level  fewer 
changes  to  applications  should  be  required. 

3.  Concurrency  and  Consistency 

Concurrency  and  consistency  issues  arise  primarily  from  the 
existence  of  duplicate  data  and  are  therefore  not  expected  to  present  any 
special  problems  for  the  physical  integration  strategy.  Apart  from  concerns 
over  wasted  storage  and  performance  inefficiencies,  concurrency  and 
consistency  issues  are  the  primary  reason  for  minimizing  duplicate  data  and 
they  present  significant  challenges  for  the  logical  integration  strategies.  In 
a  single  database,  consistency  is  handled  by  integrity  constraints  on  the 
database  schema.  The  problem  with  the  logical  strategies  is  there  is  no 
way  to  impose  actual  integrity  constraints  on  related  fields  in  different 
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databases.  Instead,  business  rules  must  be  used  to  maintain  consistency 
and  must  be  defined  in  the  application  for  application  level  integration  if 
this  is  possible  with  the  application  development  tool  in  use.  If  it  is  not 
possible,  then  the  benefits  of  integration  may  be  limited  to  read  only 
reports  for  decision  support.  Business  rules  to  maintain  consistency  should 
be  defined  in  the  universal  engine  for  logical  integration  at  the  database 
level.  Concurrency  controls  will  most  likely  have  to  be  implemented  using 
procedural  constraints  in  the  application  layer  and  may  also  be  limited  by 
the  application  development  tool  in  use. 
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IV.  CASE  STUDY:  INTEGRATION  OF  COURSE 
MANAGEMENT  AND  FINANCIAL  MANAGEMENT 

DATABASES 

A.  BACKGROUND 

1.  Financial  Management  System 

The  Financial  Management  System  (FMS)  is  a  preexisting  database 
application  developed  using  PowerBuilder  5.0.  Its  purpose  is  to  provide  a 
university  academic  department  with  an  automated  decision  support  tool 
for  managing  the  department’s  financial  resources  and  fiscal 
responsibilities.  The  Entity-Relationship  diagram  for  the  relevant  entities 
of  this  database  is  illustrated  in  Appendix  A  and  the  associated  schema  is 
contained  in  Appendix  B.  See  [Ref.  9]  for  full  details  of  this  application 
and  supporting  database. 

2.  Course  Management  System 

The  Course  Management  System  (CMS)  database  application 
manages  course  scheduling  and  course  assignments  to  instructors  and 
professors.  It  was  developed  using  PowerBuilder  5.0.  The  Entity- 
Relationship  diagram  for  the  CMS  database  is  illustrated  in  Appendix  C 
and  the  associated  schema  is  contained  in  Appendix  D. 

3.  InterDatabase  Relationships 

In  the  case  of  the  FMS  and  CMS  databases,  there  is  a  need  to 

accomplish  some  level  of  integration  since  many  of  the  professors  which 
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are  assigned  to  teach  classes  are  also  principle  investigators  (PI)  for  one  or 
more  research  projects.  This  results  in  a  natural  intersection  of  the 
information  contained  in  the  two  databases.  The  academic  department 
would  like  to  be  able  to  track  the  relationship  of  courses  taught  by  Pis 
since  there  is  a  business  rule  which  links  the  research  dollars  for  each  PI  to 
the  number  of  classes  he  or  she  is  required  to  teach.  On  further  inspection 
there  may  be  other  relationships  between  the  two  databases  which  could  be 
taken  advantage  of,  but  the  situation  described  above  was  sufficient  to 
answer  the  primary  questions  of  this  thesis.  We  wanted  to  demonstrate 
whether  or  not  it  was  possible  to  first  establish  client/server  connections  to 
multiple  databases  simultaneously  and,  if  so,  perform  inner  joins  on  tables 
from  each  of  those  databases. 

B.  INTEGRATION  TECHNIQUE 

1.  Create  CMS 

Design  and  create  the  CMS  application  and  supporting  database 
based  on  the  need  for  an  academic  department  to  schedule  courses  and 
make  course  assignments  to  professors  and  instructors.  The  CMS 
application  also  serves  as  the  integration  application  in  order  to  test  cross¬ 
database  join  techniques  which  can  be  used  in  logical  integration  at  the 
application  level. 
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2. 


Multiple  Database  Connections 


Initially  it  was  not  clear  if  and  how  multiple  simultaneous  database 
connections  could  be  created  within  an  integrated  database  application.  We 
discovered  that  this  is  a  relatively  simple  process  since  PowerBuilder 
provides  for  the  creation  of  multiple  “transaction”  objects  within  an 
application  which  can  then  be  used  to  represent  a  client  connection  to  a 
database  on  the  server.  The  code  in  Figure  5  is  an  excerpt  from  the 
“application  open”  event  script  contained  in  Appendix  E  shows  how  this 
was  done.  The  “cms_trCursor”  and  ‘fms_trCursor”  are  declared  as  gobal 
“Transaction”  type  variables  in  the  PowerBuilder  script  painter.  The  script 
statement  “cms_trCursor  =  SQLCA”  assigns  the  “cms  trCursor”  to  point  to 
the  default  database  connection  profile  “SQLCA”  for  the  CMS  application. 
The  script  statement  “fms_trCursor  =  CREATE  transaction”  creates  a  new 
“Transaction”  object  and  assigns  it  to  the  “fms_trCursor”.  The  necessary 
database  connection  profile  parameters  are  retrieved  from  the  database 
profile  sections  of  the  “PB.ini”  file  using  the  “ProfileStringQ”  function. 
The  two  Transaction  objects  are  then  connected  to  their  respective 
databases  using  the  “CONNECT”  statement. 
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//Initial  default  database  transaction 

//In  this  application  this  transaction 

//is  used  to  connect  to  the  CMS  database 

SQLCA.DBMS  =  ProfileString("PB.  INI", "Database", "DBMS"  , " ") 

SQLCA.DbParm  =  ProfileString("PB.INI","Database","DbParm", " ") 

//Transaction  cms_trCursor  declared  in  "declarerglobal"  menu 
cms_trCursor  =  SQLCA 

CONNECT  using  cms_trCursor; 

IF  cms_trCursor.SQLCODE  <>  0  THEN 
MessageBox("Connect  Error",  "CMS  connection  failed") 

HALT 
END  IF 

//Transaction  fms_trCursor  declared  in  "declareiglobal"  menu 
fms_trCursor  =  CREATE  transaction 

fmsJrCursor.DBMS  =  ProfileString("PB.INI","PROFILE  Fms","DBMS"  ,"") 
frns_trCursor.DbParm  =  ProfileString("PB.INI","PROFILE  Fms","DbParm"," ") 

CONNECT  using  fms_trCursor: 

IF  fhisJrCursor.SQLCODE  <>  0  THEN 

MessageBoxfConnect  Error","FMS  connection  failed!") 

HALT 

END  IF 


Figure  5.  Application  script  for  concurrent  database  connections. 


3.  Cross-database  Join  Query 

The  first  step  to  developing  a  method  or  technique  for  performing  a 
cross-database  join  is  to  create  a  standard  SQL  query  which  represents  the 
question  you  wish  to  answer  that  is  based  on  the  tables  in  the  relevant 
databases.  The  tables  are  identified  using  dot  notation  following  the 
database  name.  The  query  in  Figure  6  reflects  the  previous  example 
relationship  between  the  CMS  and  FMS  databases  and  returns  a  list  of 
faculty  members  who  are  Pis  for  only  one  research  project  and  who  are 
teaching  less  than  two  courses. 
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SELECT  faculty_id,  l_name,  f_name,  mi 
FROM  CMS. faculty  f,  FMS. employee  e 
WHERE  f.faculty_id  =  e . emp_i decode 
and  EXISTS  (SELECT  f.faculty^id 
FROM  CMS . teach  t 

WHERE  t .teach_fac_id  =  f.faculty_id 
GROUPBY  f.faculty_id 
HAVING  COUNT  f.faculty_id  <  2) 
and  EXISTS  (SELECT  e . emp_i decode 
FROM  FMS . pi  p 

WHERE  p.emp_id_code  =  e .  emp_id_code 

GROUPBY  e . emp_id_code 

HAVING  COUNT  e . emp_i decode  =  1) 

Figure  6.  Cross-database  query  between  CMS  and  FMS 


4.  Query  Subcomponents 

Once  the  proper  cross  database  query  is  identified,  it  is  broken  into 
subcomponent  queries  necessary  to  retrieve  data  from  the  separate 
databases.  These  subcomponents  the  will  form  the  basis  for  the 
PowerBuilder  “data  window”  objects  which  will  have  to  be  defined  and 
created.  The  first  subcomponent  identified  is  based  on  data  contained  in 
the  CMS  database  and  is  the  part  of  the  above  query  which  retrieves  all 
faculty  members  who  are  teaching  less  than  2  courses.  (Figure  7) 

SELECT  f.faculty_id 

FROM  CMS. teach  t,  CMS. faculty  f 

WHERE  t . teach_f ac_id  =  f.faculty_id 

GROUP  BY  f.faculty_id 

HAVING  COUNT  f.faculty_id  <  2 


Figure  7.  First  subcomponent  query 
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The  second  and  only  other  subcomponent  in  this  example  is  based  on  data 
contained  in  the  FMS  database  and  retrieves  all  employees  who  are  Pis  for 
only  one  research  project.  (Figure  8) 

SELECT  e .  einp_id_code 

FROM  FMS. pi  p,  FMS. employee  e 

WHERE  p.emp_i  decode  =  e .  emp_i  decode 

GROUP  BY  e. emp__id_code 

HAVING  COUNT  e . emp_i decode  <  2 

Figure  8.  Second  subcomponent  queiy 


5.  Data  Window  Objects 

Using  Powerbuilder’s  SQL  select  painter,  the  actual  queries 
representing  the  subcomponents  are  created  and  used  as  the  data  source  for 
separate  “data  window”  objects.  Figure  9  and  Figure  10  show  the  resulting 
data  source  queries  for  the  “d_teach_less_than_2”  and  “d_pi__less_than_2” 
data  window  objects  respectively. 


SELECT 

FROM 
WHERE 
GROUP  BY 

HAVING 
ORDER  BY 


"faculty" . "faculty_id",  "faculty" . "l_name"^ 
"faculty". "f_name",  "faculty" . "mi" 

"faculty",  "teach" 

("teach" . "teach_fac_id"  =  "faculty" . "faculty_id" ) 
"faculty" . "faculty_id",  "faculty" . "l_name", 
"faculty" . "f_name",  "faculty" . "mi" 

{  count ("teach" . "teach_fac_id")  <  *2’  ) 

"faculty" . "faculty_id"  ASC 


Figure  9.  Data  source  query  for  '*d_teach_less_than_2"  data  window. 
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SELECT 


"employee” .  "emp_id_code”,  "employee" .  "first_name"  , 
"employee"  .  "mi",  "employee" .  "last_name” 

FROM  "employee",  "pi" 

WHERE  ( "pi" . "emp_i decode"  =  "employee" . "emp_i decode"  ) 

GROUP  BY  "employee" . "emp_id_code"  ,  "employee" . "first__name", 
"employee" . "mi",  "employee" . "last_name" 

HAVING  (count ( "employee" , "emp_id_code")  =  1  ) 

ORDER  BY  "employee" . "emp_id_code"  ASC 

Figure  10.  Data  source  query  for  '’d_pi_less_than_2’'  data  window. 


6.  Putting  the  Pieces  Together  with  Windows  and  Scripts 

Once  the  data  windows  objects  are  completed,  they  are  placed  in  a 
PowerBuilder  “window”  using  the  “window  painter”  and  code  is  added  to 
the  window  “open”  event  script  which  implements  the  cross-database  join 
between  the  FMS  and  CMS  databases  as  reflected  in  the  WHERE  clause 
condition  (f.faculty_id  =  e.emp_id_code).  This  approach  is  illustrated 
graphically  in  Figure  11.  The  window  script  to  implement  this  approach 
uses  nested  “for  loops”  in  order  to  accomplish  the  join  cross-product  of  the 
“dw_teach_less_than_2”  and  “dw_pi_less_than_2”  result  sets.  (Figure  12) 
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dw_teach_less_than_2  dw__pi_less_than_2 


Figure  11.  Cross-database  join  illustration  using  data  windows. 


long  pi_number_of_rows 
long  teach_number_of_rows 
long  pijoopcount 
long  teachjoopcount 
string  piNum 
string  teachNum 

pi_number_of_rows  =  dw_piJess_than_2_list.RowCount() 
teach_number_of_rows  =  dw_fac_teach_less_than_2_list.RowCount() 

for  pijoopcount  =  1  to  pi_number_of_rows 

piNum  =  dwj3iJessJhan_2Jist.GetltemString(piJoopcount,1) 

for  teachjoopcount  =  1  to  teach_number_of_rows 

teachNum  =  dw_facJeachJessJhan_2Jist.GetltemString(teachJoopcount,1 ) 

IF  dw_employeeJinkJacultyJist.Retrieve(piNum, teachNum)  >=  0  THEN 
COMMIT  using  SQLCA; 

ELSE 

ROLLBACK  using  SQLCA; 

MessageBox("Retrieve", "Retrieve  error  -  employee  join  faculty  detail") 
END  IF 

next 

next 


Figure  12.  Nested  for  loop  script  to  perform  cross-database  join 
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7. 


Performance  Drag 


At  this  point  the  original  goal  of  performing  a  cross-database  join 
has  been  accomplished  with  little  difficulty;  however,  the  use  of  nested 
“for  loops”  has  a  significant  impact  on  performance.  In  this  example,  the 
result  in  the  “teach  less  than  2”  data  window  contains  7  records  and  the 
result  in  the  “pi  less  than  2”  contains  13.  This  results  in  91  retrieval  calls 
against  the  CMS. faculty  table  which  contains  74  records.  This  means  that 
6734  join  comparisons  are  required  to  get  the  final  result  which  in  this  case 
yields  only  one  name.  For  queries  over  large  tables  and  with  large 
intermediate  result  sets,  this  approach  quickly  becomes  unusable. 

8.  A  More  Efficient  T echnique 

What  is  needed  is  a  more  efficient  method  of  performing  the  same 

task.  If  the  work  done  in  the  script  can  be  reduced  to  a  single  loop  vice  the 

nested  dual  loop,  the  performance  will  be  significantly  improved.  This  can 

be  done  by  combining  the  supporting  subquery  for  the  “teach  less  than  2” 

data  window  with  the  supporting  query  for  the  “faculty_link_employee” 

data  window  and  making  using  the  resulting  query  the  new  data  source  for 

the  “faculty_link_employee”  data  window.  The  result  shown  in  Figure  13 

is  basically  the  same  subquery  for  the  condition  “teach  less  than  2”,  but  we 

have  added  the  join  condition  “faculty  id  =  :employee_id”  to  the  query’s 

“where”  clause.  The  string  “:employee_id”  is  the  retrieval  argument 

passed  to  the  data  window  inside  a  single  loop  in  the  “window”  script  and 
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is  as  shown  in  Figure  14.  The  number  of  retrieval  calls  now  required 
against  the  CMS. faculty  table  for  the  cross-database  join  has  been  reduced 
to  13  from  91  and  the  total  number  of  join  comparisons  required  inside  the 
loop  in  the  window  script  is  reduced  to  962  from  6734.  When  the  two 
different  windows  are  actually  run  in  the  application,  there  is  a  very 
noticeable  improvement  in  response  speed  of  the  second  technique  over 
that  of  the  first. 

SELECT  "faculty". "faculty_id",  "faculty". "l_name", 

"faculty". "f_name",  "faculty". "mi" 

FROM  "faculty”,  "teach" 

WHERE  ("teach" . "teach_fac_id"  =  "faculty" . "faculty_id"  ) 
and 

( "faculty" . "faculty_id"  =  :employee_id  ) 

GROUP  BY  "faculty". "faculty_id", 

"faculty" . "l_name", 

"faculty" . "f_name", 

"faculty" . "mi" 

HAVING  (  count ("teach". "teach_fac_id")  <  2  ) 

Figure  13.  Improved  data  source  query  for  data  window 
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long  number_of_rows 
long  loopcount 
string  piNUm 

number_of_rows  =  dwj3iJess_than_2Jist.RowCount() 

FOR  loopcount  =  1  to  number_of_rows 

piNum  =  dw_piJess_than_2Jist.GetltemString(loopcount,1) 

IF  dw_employee_link_facultyJist.Retrieve(piNum)  >=  0  THEN 
COMMIT  using  SQLCA; 

ELSE 

ROLLBACK  using  SQLCA; 

MessageBox("Retrieve",  "Retrieve  error  -  employee  join  feculty  detail") 
END  IF 

NEXT 

Figure  14.  Single  for  loop  script  to  perform  cross-database  join. 
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V.  PERFORMANCE  ISSUES 


A.  QUERY  OPTIMIZATION 

1.  Relational  Join  Minimization 

As  in  any  relational  database  the  join  operator  will  have  a  significant 
impact  on  the  performance  of  any  integration  solution.  In  the  case  of  the 
physical  integration  strategy  this  is  handled  to  a  large  extent  by  the  query 
optimization  rules  built  into  the  database  engine.  However,  some  attention 
is  still  required  by  the  application  developer  to  ensure  that  queries  are 
formulated  in  a  way  that  minimizes  the  number  of  intermediate  result  sets. 
In  a  strategy  of  logical  integration  at  the  database  level,  the 
implementation  of  query  optimization  rules  in  the  universal  engine  is  a 
means  of  minimizing  this  as  a  performance  bottleneck.  Integration  at  the 
application  level,  as  we  demonstrated  in  Chapter  IV,  requires  special 
attention  to  query  design  when  performing  cross-database  joins.  As  a 
result,  large  queries  are  probably  not  practical  as  a  rule  with  this  approach, 
and  probably  limit  the  usefulness  of  application  level  integration  to  smaller 
organizations  and  applications  base  on  smaller  databases. 

2.  Table  Size 

Although  there  are  many  ways  to  perform  a  join,  one  of  the  most 
common  is  the  nested  (inner-outer)  loop  approach,  often  referred  to  as  the 
brute  force  approach.  This  is  the  method  used  in  both  techniques 
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demonstrated  in  Chapter  IV.  We  are  forced  to  use  this  method  in  the  case 
study  because  it  deals  with  intermediate  result  sets  or  tables  represented  by 
data  windows  and  the  join  is  achieved  through  the  use  of  a  looping 
construct  where  the  retrieval  arguments  are  passed  to  the  data  window  for  a 
record  by  record  comparison  and  retrieval.  When  using  the  nested  loop 
approach,  query  optimization  also  becomes  a  function  of  which  table  is 
chosen  for  the  outer  loop  and  which  for  the  inner  loop.  A  factor  known  as 
the  “join  selection  factor”  affects  the  performance  of  a  join  and  depends  on 
the  equijoin  condition  of  the  two  tables  and  their  size.  The  equijoin 
condition  affects  the  percentage  of  records  in  a  table  which  will  be  joined 
with  records  in  the  other  file.  In  the  example  given  in  the  case  study,  as 
illustrated  by  the  “faculty”  window  screen  capture  in  Appendix  E,  there  are 
13  PI  employee  records  with  less  than  2  research  projects  retrieved  from  a 
table  of  88  records  in  the  FMS  database  while  there  only  7  faculty  members 
teaching  fewer  than  2  courses  retrieved  from  a  table  of  22  teach  records  in 
the  CMS  database.  Therefore,  we  must  perform  616  (7  x  88)  join 
comparisons  if  we  use  the  “PI  less  than  2”  intermediate  result  as  the  outer 
loop  versus  276  (13  x  22)  if  we  use  the  “teach  less  than  2”  intermediate 
result  set  as  the  outer  loop.  Even  though  the  numbers  in  this  example  are 
relatively  small,  it  is  sufficient  to  show  how  performance  can  be 
significantly  affected  if  one  table  is  significantly  larger  than  the  other  and 
only  a  small  percentage  of  the  records  in  that  table  are  selected  for  join. 


34 


B.  PARALLELISM 


Logical  integration  strategies  offer  the  ability  to  take  advantage  of 
parallelism  which  can  be  achieved  from  multiple  concurrent  database 
connections  and  can  realize  an  overall  performance  advantage  over  a 
physical  integration  strategy  depending  on  the  overall  system  network 
configuration  and  performance.  The  inherently  distributed  nature  of  the 
logical  approach  avoids  the  problem  of  server  overload  which  is  more 
likely  to  occur  with  a  single  physical  integrated  database  on  a  centralized 
server. 
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VI.  CONCLUSIONS 


Since  the  introduction  of  the  relational  data  model  and  the 
tremendous  growth  in  the  use  of  computers  in  recent  years,  the  use  of 
database  applications  as  a  tool  for  managing  information  has  grown  in 
popularity.  Organizations  have  discovered  that  they  now  have  many 
databases  and  applications  which  manage  access  to  them,  but  they  find  it 
increasingly  difficult  to  sift  through  the  large  amounts  of  information  at 
their  disposal  because  many  of  these  databases  contain  overlapping  data 
which  is  not  coordinated  in  any  useful  or  meaningful  way.  The  challenge 
now  is  how  to  integrate  existing  databases  so  that  information  access  and 
retrieval  among  numerous  data  systems  is  efficient  and  effective.  We  have 
considered  three  strategies  for  accomplishing  database  integration  and 
discussed  some  of  the  probable  challenges,  benefits,  and  limitations  of 
each  in  the  context  of  the  relational  data  model.  As  a  result,  we  are 
reminded  that  one  of  the  primary  sources  of  difficulty  in  relational 
database  retrieval  is  the  join  operation.  This  is  even  more  significant  when 
seeking  an  integration  solution  since  the  relational  model  is  not  conducive 
to  cross-database  join  access.  Using  PowerBuilder  5.0  we  showed 
techniques  for  performing  a  join  across  multiple  database  boundaries  as  a 
method  of  integrating  existing  databases  at  the  application  level.  This 
thesis  demonstrates  that  it  is  possible  to  perform  some  level  of  integration 
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at  the  application  level.  However,  this  approach  is  not  practical  for  use 
with  large  or  complex  queries  for  performance  reasons  making  it  unsuitable 
for  use  in  large  systems.  It  can  be  a  useful  means  of  performing  limited 
database  integration  between  smaller  applications  to  leverage  existing 
information  resources  to  provide  greater  decision  support  utility  or 


administrative  functionality. 


APPENDIX  A.  ENTITY  RELATIONSHIP  DIAGRAM  FOR 

THE  FMS  DATABASE 


This  entity  relationship  diagram  is  a  screen  capture  of  PowerBuilder 
5.0’s  graphic  representation  of  the  FMS  database  described  in  Chapter  IV. 
Only  tables  which  are  relevant  to  this  work  are  included. 
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Entity  relationship  diagram  of  FMS  created  using  S-designor  5.0 


AppModeler  Desktop. 
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APPENDIX  B.  FMS  DATABASE  SCHEMA 
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APPENDIX  C.  CMS  ENTITY-RELATIONSHIP  DIAGRAM 

This  entity  relationship  diagram  is  a  screen  capture  of  PowerBuilder 


5.0’s  graphic  representation  of  the  CMS  database  described  in  Chapter  IV. 


-  Entity  relationship  diagram  of  CMS  created  using  S-designor  5.0 


AppModeler  Desktop. 
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APPENDIX  D.  CMS  DATABASE  SCHEMA 
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APPENDIX  E.  CMS  SCRIPTS  AND  SAMPLE  SCREEN 

CAPTURES 


Script  for  implication  Open  Event 


//Initial  default  database  transaction 
//In  this  application  this  transaction 
//is  used  to  connect  to  the  CMS  database 

SQLCA.DBMS  -  Prof ileString ( ”PB . INI”, "Database", "DBMS”  ,  "  ") 
SQLCA.DbParm  =  Prof ileString ( "PB. INI", "Database", "DbParm" ,  "  ") 

//Transaction  cms^trCursor  declared  in  "declare: global"  menu 
cms_trCursor  =  SQLCA 

CONNECT  using  cms_trCursor ; 

IF  cms_trCursor.SQLCODE  <>  0  THEN 

MessageBox ( "Connect  Error",  "CMS  connection  failed") 

HALT 
END  IF 


//Transaction  fms_trCursor  declared  in  "declare : global"  menu 
fms^trCursor  =  CREATE  transaction 

fms_tr Curs or. DBMS  -  Profilestring ( "PB. INI", "PROFILE  Fms","DBMS",  "  ") 

fms_trCursor. DbParm  =  Profilestring ( "PB. INI", "PROFILE  Fms", "DbParm", "  ") 

CONNECT  using  fms_trCursor ; 

IF  fms_trCursor.SQLCODE  <>  0  THEN 

MessageBox ( "Connect  Error", "FMS  connection  failed!") 

HALT 

END  IF 


/*  let  user  control  the  toolbar*/ 
toolbarusercontrol  =  TRUE 
toolbartext  =  TRUE 

open (w_main) 
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Script  for  ^Application  Close  Event 


transaction  cms_trCursor 
cms_trCursor  =  SQLCA 

DISCONNECT  using  cms_trCursor; 

IF  cins__trCursor .  SQLCODE  <>  0  THEN 
ROLLBACK  using  citis_trCursor ; 

MessageBox ( "Disconnect",  cms^trCursor . SQLERRTEXT) 
END  IF 
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Script  for  w^faculty  "window"  Open  Event  with  nested  loop 


//  associate  each  data  window  with  a  database  connection 
//  transaction  object 

dw_fac_teach__less_than_2_list .  settransobject  (  sqlca  ) 
dw_pi_less_than_2_list .  settransobject  (fms_tr cursor) 
dw__employee_link_faculty_list .  settransobject  (  sqlca  ) 

//  disable  edit  control  for  each  data  window 
dw_pi_less_than_2_list. Modify ("datawindow. readonly  =  yes") 
dw_fac_teach_less_than_2_list .Modify ( "datawindow. readonly  =  yes") 
dw_employee_link_f aculty_list . Modify ( "datawindow . readonly  =  yes " ) 

//  retrieve  list  of  faculty  from  CMS  for  which  teach  instances.. 
//  are  less  than  two  dw_fac_teach_less_than_2__list  data  window 
IF  dw_f ac_teach_less_than_2_list. Retrieve  0  =  -1  THEN 
ROLLBACK  using  SQLCA; 

MessageBox ( "Retrieve”,  "Faculty  Teach  Retrieval  Error") 

ELSE 

COMMIT  using  SQLCA; 

END  IF 

//  retrieve  list  of  employees  from  FMS  for  which  PI  instances 
//  are  less  than  2  in  dw_pi_less_than_2_list  data  window 
IF  dw_pi_less_than_2_list .Retrieve ( )  =  -I  then 
rollback  using  fms_trCursor; 

messagebox ( "Error”, "PI  Employee  List  Retrieval  Error”) 

ELSE 

commit  using  fms_trCursor; 

END  IF 
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Script  for  w^faculty  "window"  Open  Event  with  nested  loop 
(continued) 


long  pi_number_of_rows 
long  teach_nuinber_of_rows 
long  pi_loopcount 
long  teach_loopcount 
string  piNum 
string  teachNim 

//  get  number  of  rows  in  pi  result  data  window 
pi_number_of__rows  =  dw_pi_less_than_2_list .  RowCount  ( ) 

//  get  number  of  rows  in  teach  result  data  window 
teach_number_of_rows  =  dw_f  ac_teach_less_than_2_list .  RowCount  ( ) 

for  pi__loopcount  =  1  to  pi_number_of_rows 

piNum  =  dw_pi_less_than_2_list .  GetItemString  (pi_loopcount ,  1 ) 

for  teach_loopcount  =  1  to  teach_number_of_rows 

teachNum  = 

dw_fac_teach_less_than_2_list .GetItemString (teach_loopcount, 1) 

IF  dw_employee_link__faculty_list .Retrieve (piNum, teachNum)  >=  0 
THEN 

COMMIT  using  SQLCA; 

ELSE 

ROLLBACK  using  SQLCA; 

MessageBox( "Retrieve”,  "Retrieve  error  -  employee  join 

faculty  detail”) 

END  IF 
next 

next 
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Script  for  w_faculty  "window"  Open  Event  with  single  for  loop 


//  associate  each  data  window  with  a  database  connection 
//  transaction  object 

dw__fac_teach_less_than_2_list , settransobject  (  sqlca  ) 
dw_pi_less_than_2_list .  settransobj  ect  ( fins_tr cursor) 
dw_einployee_link_f aculty_list .  settransobj  ect  {  sqlca  ) 

//  disable  edit  control  for  each  data  window 

dw_fac_teach_less__than_2_list .Modify { " da tawindow. readonly  =  yes") 
dw_pi_less__than_2_list.Modify ("datawindow.readOnly  =  yes") 
dw_employee_link__f  aculty__list .  Modify  ( "datawindow .  readonly  =  yes  " ) 

//  retrieve  list  of  faculty  from  CMS  for  which  teach  instances 
//  are  less  than  two  dw_fac_teach_less_than_2_list  data  window 
IF  dw_f ac__teach__less_than_2_list. Retrieve  0  =  -1  THEN 
ROLLBACK  using  SQLCA; 

MessageBox ("Retrieve",  "Faculty  Teach  Retrieval  Error") 

ELSE 

COMMIT  using  SQLCA; 

END  IF 

//  retrieve  list  of  employees  from  FMS  for  which  PI  instances 
//  are  less  than  2  in  dw__pi_less_than_2_list  data  window 
IF  dw_pi_less__than_2_list  .Retrieve  { )  =  -1  THEN 
ROLLBACK  using  fms_trCursor; 

messagebox ("Error", "Employee  List  Retrieval  Error") 

ELSE 

COMMIT  using  fms_trCursor; 

END  IF 


long  number_of_rows 
long  loopcount 
string  piNum 

number_of_rows  =  dw__pi_less_than_2_list .  RowCount  ( ) 

FOR  loopcount  =  1  to  number_of_rows 

piNum  =  dw_pi_less_than_2__list. Get Itemstring (loopcount, 1) 

IF  dw_employee_link_faculty_list . Retrieve (piNum)  >=  0  THEN 
COMMIT  using  SQLCA; 

ELSE 

ROLLBACK  using  SQLCA; 

MessageBox ("Retrieve", 

"Retrieve  error  -  employee  join  faculty  detail") 

END  IF 
NEXT 
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Screen  capture  of  w_faculty  window 
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